
Adding fmaddsub #1136

Merged
serge-sans-paille merged 1 commit into xtensor-stack:master from DiamonDinoia:adding-fmaddsub on Jul 2, 2025

Conversation

@DiamonDinoia
Contributor

Hi all,

I have added support and tests for fmaddsub, since it is useful in my scientific computations. I tested it on x86, but I do not have access to ARM/NEON hardware.

I also considered the name fmas, but I have no strong opinion on the function name; I picked something that seemed to fit the existing style.

Let me know what you think and if/how we can get the new API included!

Thanks,
Marco

struct even_lane {
    static constexpr bool get(unsigned const i, unsigned) noexcept { return (i & 1u) == 0; }
};
const auto mask = make_batch_bool_constant<T, even_lane, A>();
return fma(x, y, select(mask, neg(z), z));
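For readers unfamiliar with the operation: fmaddsub alternates subtraction and addition across lanes, as in the x86 fmaddsub intrinsics (even lanes compute x*y - z, odd lanes x*y + z). A minimal scalar reference sketch of that semantics (the helper name fmaddsub_ref is hypothetical, not xsimd API):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scalar reference for fmaddsub: even lanes subtract z, odd lanes add z,
// mirroring the select(mask, neg(z), z) trick in the snippet above.
std::vector<double> fmaddsub_ref(const std::vector<double>& x,
                                 const std::vector<double>& y,
                                 const std::vector<double>& z)
{
    std::vector<double> r(x.size());
    for (std::size_t i = 0; i < x.size(); ++i)
        r[i] = (i % 2 == 0) ? x[i] * y[i] - z[i]   // even lane: subtract
                            : x[i] * y[i] + z[i];  // odd lane: add
    return r;
}
```

For example, fmaddsub_ref({1, 2}, {3, 4}, {5, 6}) yields {-2, 14}: lane 0 is 1*3 - 5 and lane 1 is 2*4 + 6.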
Contributor

Is select a better option compared to a multiply?

@DiamonDinoia (Contributor, Author), Jul 2, 2025

I think the multiply approach is overly complicated. If we extend batch_constant to support double (i.e. drop the bitwise-operator requirement), then it might be worth it.

struct imag_neg {
    static constexpr int get(const unsigned i, unsigned) noexcept { return (i & 1u) ? 1 : -1; }
};

// Generator first, then arch
const auto mask = xsimd::batch_cast<double>(
    xsimd::make_batch_constant<xsimd::as_integer_t<double>, imag_neg, arch>().as_batch());

Also, I do not see a big difference on my machine:

[benchmark screenshot: swizzle/multiply vs swizzle/select]

Contributor

thanks for going all the way down to the experiment!

@serge-sans-paille serge-sans-paille merged commit a64668d into xtensor-stack:master Jul 2, 2025
63 checks passed
